⚡️ Speed up function is_passthrough_request_using_router_model
by 6,776%
#2
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
📄 6,776% (67.76x) speedup for
is_passthrough_request_using_router_model
inlitellm/proxy/pass_through_endpoints/llm_passthrough_endpoints.py
⏱️ Runtime :
84.5 milliseconds
→1.23 milliseconds
(best of151
runs)📝 Explanation and details
The optimization introduces caching to eliminate expensive repeated calls to
llm_router.get_model_names()
.Key Changes:
_model_names_cache
that storesset
objects keyed by router instance IDreturn model in model_names_set
Why This Creates Massive Speedup:
The line profiler shows
llm_router.get_model_names()
was the bottleneck, taking 96% of execution time (373ms out of 389ms total). This suggests the method is expensive - likely involving I/O operations or complex data processing. By caching the converted set, we:get_model_names()
now only runs once per unique router (50 times vs 2056 times in the profile)Test Case Performance:
This optimization is particularly effective for applications that repeatedly query the same router instance with different models, which appears to be the common usage pattern based on the test scenarios.
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes
git checkout codeflash/optimize-is_passthrough_request_using_router_model-mh1bq9p4
and push.